122 research outputs found

    On the surplus value of semantic video analysis beyond the key frame

    Get PDF
    Typical semantic video analysis methods aim to classify camera shots based on features extracted from a single key frame only. In this paper, we sketch a video analysis scenario and evaluate the benefit of analysis beyond the key frame for semantic concept detection performance. We developed detectors for a lexicon of 26 concepts and evaluated their performance on 120 hours of video data. Results show that, on average, detection performance can increase by almost 40% when the analysis method takes more visual content into account.
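    The idea of scoring a shot beyond its key frame can be illustrated with a minimal sketch: average a detector's per-frame concept probabilities over all sampled frames instead of trusting the single key frame. The scores below are toy numbers, not outputs of the paper's detectors.

```python
# Sketch: shot-level concept scoring beyond a single key frame.
# Per-frame scores are hypothetical stand-ins for real detector output.

def key_frame_score(frame_scores, key_index):
    """Score a shot using only its designated key frame."""
    return frame_scores[key_index]

def multi_frame_score(frame_scores):
    """Score a shot by averaging detector output over all sampled frames."""
    return sum(frame_scores) / len(frame_scores)

# Toy example: the key frame happens to miss the concept,
# but the other frames in the shot show it clearly.
scores = [0.15, 0.80, 0.75, 0.70]      # per-frame concept probabilities
single = key_frame_score(scores, 0)    # 0.15 -- key frame alone misleads
pooled = multi_frame_score(scores)     # 0.6  -- shot-level evidence
```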

    A Family of Maximum Margin Criterion for Adaptive Learning

    Full text link
    In recent years, pattern analysis has played an important role in data mining and recognition, and many variants have been proposed to handle complicated scenarios. High dimensionality of data samples is familiar in the literature, but both this characteristic and large data volumes have become commonplace in real-world applications. In this work, an improved maximum margin criterion (MMC) method is first introduced. With the new definition of MMC, several variants, including random MMC, layered MMC, and 2D^2 MMC, are designed to make adaptive learning applicable. In particular, the MMC network is developed to learn deep features of images in the manner of simple deep networks. Experimental results on a diverse set of data sets demonstrate that the discriminant ability of the proposed MMC methods makes them competent for complicated application scenarios.
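    The classical criterion that the paper's variants build on can be sketched as maximizing tr(W^T (S_b - S_w) W), i.e. taking the top eigenvectors of the difference between the between-class and within-class scatter matrices. This is a sketch of the base MMC only, not of the random, layered, or 2D^2 variants; variable names are illustrative.

```python
import numpy as np

def scatter_matrices(X, y):
    """Between-class (S_b) and within-class (S_w) scatter matrices."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    S_b = np.zeros((d, d))
    S_w = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        diff = (mc - mean_all)[:, None]
        S_b += len(Xc) * diff @ diff.T      # class-mean spread
        S_w += (Xc - mc).T @ (Xc - mc)      # within-class spread
    return S_b, S_w

def mmc_projection(X, y, k):
    """Top-k eigenvectors of S_b - S_w form the MMC projection matrix."""
    S_b, S_w = scatter_matrices(X, y)
    vals, vecs = np.linalg.eigh(S_b - S_w)  # symmetric, so eigh applies
    return vecs[:, np.argsort(vals)[::-1][:k]]  # d x k projection
```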

    Visual Word Ambiguity

    Full text link

    Constant force muscle stretching induces greater acute deformations and changes in passive mechanical properties compared to constant length stretching

    Get PDF
    Stretching is applied to lengthen shortened muscles in pathological conditions such as joint contractures. We investigated (i) the acute effects of different types of stretching, i.e. constant length (CL) and constant force (CF) stretching, on acute deformations and changes in passive mechanical properties of the medial gastrocnemius muscle (MG) and (ii) the association of acute muscle–tendon deformations or changes in mechanical properties with the impulse or maximal strain of stretching. Forty-eight hindlimbs from 13 male and 12 female Wistar rats (13 weeks old, respectively 424.6 ± 35.5 and 261.8 ± 15.6 g) were divided into six groups (n = 8 each). The MG was initially stretched to a length at which the force was 75%, 95%, or 115% of the force corresponding to estimated maximal dorsiflexion and held at either CF or CL for 30 min. Before and after the stretching protocol, the MG peak force and peak stiffness were assessed by lengthening the passive muscle to the length corresponding to maximal ankle dorsiflexion. Also, the muscle belly length and tendon length were measured. CF stretching affected peak force, peak stiffness, muscle belly length, and tendon length more than CL stretching (p < 0.01). Impulse was associated only with the decrease in peak force, while maximal strain was associated with the decrease in peak force, peak stiffness, and the increase in muscle belly length. We conclude that CF stretching results in greater acute deformations and changes in mechanical properties than CL stretching, which appears to depend predominantly on the differences in imposed maximal strain.

    Edge and corner detection by photometric quasi-invariants

    Full text link

    Visual Image Search: Feature Signatures or/and Global Descriptors

    Get PDF
    The success of content-based retrieval systems stands or falls with the quality of the utilized similarity model. In the case of having no additional keywords or annotations provided with the multimedia data, the hard task is to guarantee the highest possible retrieval precision using only content-based retrieval techniques. In this paper we push visual image search a step further by testing an effective combination of two orthogonal approaches: MPEG-7 global visual descriptors and feature signatures equipped with the Signature Quadratic Form Distance. We investigate various ways of combining the descriptors and evaluate the overall effectiveness of the search on three different image collections. Moreover, we introduce a new image collection, TWIC, designed as a larger realistic image collection providing ground truth. In all the experiments, the combination of descriptors proved superior on all tested collections. Furthermore, we propose a re-ranking variant guaranteeing efficient yet effective image retrieval.
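    A minimal way to picture the descriptor combination is a convex mix of two pre-normalized distances, one from the global descriptor and one from the signature-based measure, followed by ranking. The weight and the distances below are illustrative stand-ins, not values from the paper.

```python
# Sketch: combining two orthogonal (pre-normalized) distance measures,
# e.g. an MPEG-7 global-descriptor distance and a Signature Quadratic
# Form Distance, into one ranking score. alpha is a toy weight.

def combined_distance(d_global, d_signature, alpha=0.5):
    """Convex combination of two distances (smaller = more similar)."""
    return alpha * d_global + (1 - alpha) * d_signature

def rank_images(candidates, alpha=0.5):
    """candidates: list of (image_id, d_global, d_signature) tuples.
    Returns image ids ordered from most to least similar."""
    ranked = sorted(candidates,
                    key=lambda c: combined_distance(c[1], c[2], alpha))
    return [image_id for image_id, _, _ in ranked]
```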

    Validating the detection of everyday concepts in visual lifelogs

    Get PDF
    The Microsoft SenseCam is a small lightweight wearable camera used to passively capture photos and other sensor readings from a user's day-to-day activities. It can capture up to 3,000 images per day, equating to almost 1 million images per year. It is used to aid memory by creating a personal multimedia lifelog, or visual recording of the wearer's life. However, the sheer volume of image data captured within a visual lifelog creates a number of challenges, particularly for locating relevant content. Within this work, we explore the applicability of semantic concept detection, a method often used within video retrieval, to the novel domain of visual lifelogs. A concept detector models the correspondence between low-level visual features and high-level semantic concepts (such as indoors, outdoors, people, buildings, etc.) using supervised machine learning. By doing so, it determines the probability of a concept's presence. We apply detection of 27 everyday semantic concepts to a lifelog collection composed of 257,518 SenseCam images from 5 users. The results were then evaluated on a subset of 95,907 images, to determine the precision for detection of each semantic concept and to draw some interesting inferences on the lifestyles of those 5 users. We additionally present future applications of concept detection within the domain of lifelogging. © 2008 Springer Berlin Heidelberg
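    The detector described above, a supervised model mapping low-level features to a concept-presence probability, can be sketched with a simple logistic model. The weights and feature values here are illustrative, not learned from SenseCam data.

```python
import math

# Sketch of a concept detector as a probabilistic classifier: a learned
# weight vector maps low-level features (e.g. colour/edge statistics)
# to P(concept present | features). Weights below are toy values.

def concept_probability(features, weights, bias):
    """Logistic model: probability that the concept is present."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical feature vector for one image, scored for one concept.
p = concept_probability([0.2, 0.7, 0.1], [1.5, 2.0, -1.0], -1.0)
```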

    An Image Statistics–Based Model for Fixation Prediction

    Get PDF
    The problem of predicting where people look, or equivalently salient region detection, has been related to the statistics of several types of low-level image features. Among these features, contrast and edge information seem to have the highest correlation with fixation locations. The contrast distribution of natural images can be adequately characterized using a two-parameter Weibull distribution. This distribution captures the structure of local contrast and edge frequency in a highly meaningful way. We exploit these observations and investigate whether the parameters of the Weibull distribution constitute a simple model for predicting where people fixate when viewing natural images. Using a set of images with associated eye movements, we assess the joint distribution of the Weibull parameters at fixated and non-fixated regions. Then, we build a simple classifier based on the log-likelihood ratio between these two joint distributions. Our results show that as few as two values per image region are already enough to achieve a performance comparable with the state-of-the-art in bottom-up saliency prediction
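    The two-value classifier can be sketched as follows: each region is summarized by its Weibull (scale, shape) parameters, and the score is the log-likelihood ratio of those two values under the fixated versus non-fixated distributions. The independent Gaussian marginals and all parameter values below are illustrative stand-ins for the empirically estimated joint distributions.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Univariate Gaussian density."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def log_likelihood_ratio(scale, shape, fix_params, nonfix_params):
    """log p(scale, shape | fixated) - log p(scale, shape | non-fixated),
    assuming (illustratively) independent Gaussian marginals."""
    p_fix = (gaussian_pdf(scale, *fix_params["scale"]) *
             gaussian_pdf(shape, *fix_params["shape"]))
    p_non = (gaussian_pdf(scale, *nonfix_params["scale"]) *
             gaussian_pdf(shape, *nonfix_params["shape"]))
    return math.log(p_fix) - math.log(p_non)

fix = {"scale": (0.6, 0.2), "shape": (1.8, 0.4)}   # toy estimates
non = {"scale": (0.3, 0.2), "shape": (1.2, 0.4)}
score = log_likelihood_ratio(0.55, 1.7, fix, non)  # > 0 -> predict fixation
```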

    Evaluating Multimedia Features and Fusion for Example-Based Event Detection

    Get PDF
    Multimedia event detection (MED) is a challenging problem because of the heterogeneous content and variable quality found in large collections of Internet videos. To study the value of multimedia features and fusion for representing and learning events from a set of example video clips, we created SESAME, a system for video SEarch with Speed and Accuracy for Multimedia Events. SESAME includes multiple bag-of-words event classifiers based on single data types: low-level visual, motion, and audio features; high-level semantic visual concepts; and automatic speech recognition. Event detection performance was evaluated for each event classifier. The performance of low-level visual and motion features was improved by the use of difference coding. The accuracy of the visual concepts was nearly as strong as that of the low-level visual features. Experiments with a number of fusion methods for combining the event detection scores from these classifiers revealed that simple fusion methods, such as arithmetic mean, perform as well as or better than other, more complex fusion methods. SESAME’s performance in the 2012 TRECVID MED evaluation was one of the best reported
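    The fusion finding above, that a simple arithmetic mean of classifier scores matches more complex schemes, amounts to averaging aligned score lists. The classifier names and scores below are illustrative, not SESAME outputs.

```python
# Sketch of late fusion by arithmetic mean: average per-video event
# scores across several independently trained event classifiers.

def arithmetic_mean_fusion(score_lists):
    """Average aligned score lists, one list per classifier."""
    return [sum(scores) / len(scores) for scores in zip(*score_lists)]

visual = [0.9, 0.2, 0.4]   # e.g. low-level visual classifier
motion = [0.7, 0.4, 0.2]   # e.g. motion-feature classifier
speech = [0.8, 0.0, 0.6]   # e.g. ASR-based classifier
fused = arithmetic_mean_fusion([visual, motion, speech])
# fused ~ [0.8, 0.2, 0.4]
```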